To conserve species, we must first identify them. Field researchers, land managers, educators, and citizen scientists need up-to-date and accessible
tools to identify organisms, organize data, and share observations. Emerging technologies complement traditional, book-form field guides
by providing users with a wealth of multimedia data. We review technical innovations of next-generation field guides, including Web-based and 
stand-alone applications, interactive multiple-access keys, visual-recognition software adapted to identify organisms, species checklists that can be 
customized to particular sites, online communities in which people share species observations, and the use of crowdsourced data to refine machine- 
based identification algorithms. Next-generation field guides are user friendly; permit quality control and the revision of data; are scalable to 
accommodate burgeoning data; protect content and privacy while allowing broad public access; and are adaptable to ever-changing platforms and 
browsers. These tools have great potential to engage new audiences while fostering rigorous science and an appreciation for nature. 

Accurate species identification is a crucial prerequisite
to documenting, managing, and sustaining the diversity
of life on Earth. Basic identification begins with field guides: 
clearly presented syntheses of technical information about 
taxa, paired with identification keys for distinguishing among 
taxa. Close observation and illustrated, printed field guides 
are the traditional tools for organism identification. Now, 
myriad new electronic tools are emerging to help everyone 
from curious novices to seasoned biologists identify existing 
species; determine whether a species is new to science; and 
share observations, data, and discoveries with the wider world 
(Stevenson et al. 2003, Agarwal et al. 2006). 

Here, we explore how field guides are overcoming the lim- 
itations of bound books by evolving in tandem with infor- 
mation technology. We review a broad range of applications 
(or apps) and detail several case studies to illustrate how 
next-generation field guides are facilitating the identification 
of organisms; being used to create customized guides to the 
flora and fauna of particular sites; promoting networking 
among a new generation of naturalists; enabling the collec- 
tion and sharing of valuable scientific data; and encourag- 
ing interdisciplinary research in biology, computer science, 
education, and cognition. We discuss emerging innovations 
that are yielding especially successful apps in terms of their 
accuracy; ease of use; and ability to stimulate learning, par- 
ticularly among nonscientists. We also highlight important 
challenges for ensuring that next-generation field guides 
and their associated resources can successfully reach their 
intended audiences, adapt to and evolve with new informa- 
tion and technologies, and be sustained over the long term. 
Although our focus is on field guides to organisms, classification
systems are central to the practice of any science.
The innovations and challenges that we discuss here can
inform the creation of user-friendly, rigorous guides to items 
from molecules to Messier objects. 

What is a field guide? 

The term field guide broadly encompasses geographically 
restricted or taxonomically constrained (pragmatic) checklists,
monographic treatments of particular taxa, comprehensive 
descriptions of regional natural communities, textbooks, 
nontechnical illustrated posters, flashcards, or brochures, 
as well as hybrids of these (Hawthorne and Harris 2006). 
The earliest field guides were created before the fourteenth 
century and were illustrated, utilitarian descriptions, such 
as herbals (Givens 2006), but until the invention of mov- 
able type, these works were reproduced rarely and saw only 
limited distribution. The expense of illustrations precluded 
widespread publication of biological compendia; illustra- 
tions were often eschewed in favor of technical text in the 
form of dichotomous keys (Scharf 2009). Books contain- 
ing such keys burgeoned in the eighteenth century as the 
number of newly described species increased exponentially 
(Scharf 2007). 

The first field guides with contemporary characteristics — 
detailed descriptions of species, with illustrations, clear 
taxonomic organization, and prose accessible to the lay 
public — were published in the early 1900s (for a review, 
see Dunlap 2005). Today, field guides in wide use by both 
professional and amateur scientists typically consist of two 
sections: (1) an overview of the broad group of organisms 
of interest, including tips for accurately observing them, 
their evolutionary relationships, and keys with which to 
identify them, and (2) species accounts, featuring descriptive 
images, range information, ecology, behavior, habitats, taxonomy,
and synonymy (Stevenson et al. 2003).

Although many field guides are written for general audi- 
ences, they are also being used increasingly as authoritative 
references for species identification. For example, Schmidt 
(2006) found a nearly fivefold increase in citations of field 
guides — from 50 to 248 per year — in scientific publications 
between 1990 and 2004. Likewise, consumer demand for field 
guides continues to grow. In 2000, a search of Amazon.com 
book titles for the keywords field guide yielded 625 total 
entries (Stevenson et al. 2003); in 2012, we ran a similar
search, which yielded 1849 titles, with 81 forthcoming and 
published titles for 2012-2013 alone. Of these latter pub- 
lications, 29 covered birds; 19 dealt with plants and fungi; 
10 were on mammals and herpetofauna; 8 were about inver- 
tebrates; 7 were site-based treatments of multiple types of 
organisms; and 8 addressed gems, fossils, animal tracks, 
weather, or astronomical objects. 

The classical bound field guide has well-known limitations.
Books are relatively expensive and not always portable,
and their information can be updated only when new editions
are published. Crucial features for identifying many
animals, including sounds (e.g., bird calls), movement patterns,
or behavioral characteristics, can only be approximated
in print but are communicated much more easily with video
or audio media. These and other multimedia features are
increasingly included as online supplements to printed field 
guides or as stand-alone, entirely digital field guides. 

Digital portals also invite active participation by users in 
ways that a book cannot offer. Near-universal inclusion of 
Global Positioning System (GPS) capabilities in handheld
devices lets people instantly record where they sighted an 
organism and simultaneously contribute their observations 
to online communities, social networks, and repositories 
of scientific data. Educators have long recognized that K-12 
and college students are comfortable with — and quickly 
master — such sophisticated hardware and software (Ellis 
1984, Tapscott 1998). New technologies are frequently used 
by educators seeking to better engage students in studying 
science, technology, engineering, and mathematics and to 
involve volunteers in citizen-science projects (Kress 2004, 
Newman et al. 2012). In fact, a recent meta-analysis showed 
that using new instructional technologies and fostering 
student collaboration in scientific inquiry both have positive 
impacts on achievement (Schroeder et al. 2007). 

The evolution of online field guides and 
identification keys 

Digital identification tools have increased both in number 
and computational sophistication in the past several decades 
(Dallwitz et al. 2013). Early punch-card keys (polyclaves)
allowed users to narrow down a set of species by physi- 
cally aligning those that had matching character states 
(e.g., Simpson and Janos 1974). Basic computational features 
developed in the late 1960s included the ability for a user 
to choose the characters used for keying (i.e., relevant,
observable characters, not a constrained series of steps); the
ability to enter numeric values for character states; built-in 
ranges of data to account for uncertainty, user error, or 
polymorphisms in character states; and the retention of taxa 
during keying when data were missing for particular char- 
acter states (Goodall 1968). Later, digital keys offered users 
guidance on the most informative questions to answer for 
particular taxa (Morse 1971) and also made it clear when 
certain character states were not applicable or were contin- 
gent on the states chosen for other characters (Pankhurst 
1991). Many digital keys were — and are — types of expert 
systems (a term from artificial intelligence), in which back- 
ground knowledge of particular taxa (the heuristic knowl- 
edge that many experienced taxonomists have) has been 
hard-coded into the programming, thus proffering the most 
informative questions first to the user in a semiguided key- 
ing process (Edwards and Morse 1995). Many programs also 
tend to choose species on the basis of positive matches across 
a range of character states rather than by eliminating taxa 
from the results set if one or more of their character states 
do not fit the data or if a full set of data is missing. 
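
The keying behaviors described above can be sketched in a few lines of Python. This is a minimal illustration, not a reconstruction of any of the cited programs: the matrix contents, the function names, and the "most distinct states" heuristic for choosing the next question are all assumptions made for the example. It shows elimination that retains taxa with missing data, ranking by positive matches, and a simple way to suggest an informative character.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical taxon x character-state matrix (illustrative values only).
# None marks a character for which no data were recorded.
Matrix = Dict[str, Dict[str, Optional[Set[str]]]]

MATRIX: Matrix = {
    "red maple": {"leaf_arrangement": {"opposite"}, "leaf_type": {"simple"}, "margin": {"toothed"}},
    "sugar maple": {"leaf_arrangement": {"opposite"}, "leaf_type": {"simple"}, "margin": {"entire"}},
    "white ash": {"leaf_arrangement": {"opposite"}, "leaf_type": {"compound"}, "margin": None},
    "red oak": {"leaf_arrangement": {"alternate"}, "leaf_type": {"simple"}, "margin": {"lobed", "toothed"}},
}


def filter_taxa(matrix: Matrix, observed: Dict[str, str]) -> List[str]:
    """Drop taxa that contradict an observation; retain taxa with missing data."""
    kept = []
    for taxon, states in matrix.items():
        contradicted = False
        for character, value in observed.items():
            known = states.get(character)
            if known is not None and value not in known:
                contradicted = True
                break
        if not contradicted:
            kept.append(taxon)
    return kept


def rank_taxa(matrix: Matrix, observed: Dict[str, str]) -> List[Tuple[str, int]]:
    """Rank every taxon by its number of positive matches (no elimination)."""
    scores = []
    for taxon, states in matrix.items():
        score = 0
        for character, value in observed.items():
            known = states.get(character)
            if known is not None and value in known:
                score += 1
        scores.append((taxon, score))
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


def most_informative(matrix: Matrix, candidates: Iterable[str],
                     asked: Iterable[str]) -> Optional[str]:
    """Suggest the unasked character with the most distinct known states."""
    taxa = list(candidates)
    characters = sorted({c for t in taxa for c in matrix[t]} - set(asked))
    best, best_count = None, 0
    for character in characters:
        states = [matrix[t].get(character) for t in taxa]
        distinct = {frozenset(s) for s in states if s is not None}
        if len(distinct) > best_count:
            best, best_count = character, len(distinct)
    return best
```

With the observation {"leaf_arrangement": "opposite", "margin": "toothed"}, filter_taxa keeps red maple and also white ash (whose margin is unrecorded), whereas rank_taxa keeps all four taxa and places red maple first.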

Today, the emphasis is on usability for novices and experts 
alike, the versatility of platforms and presentation, and accu- 
rate data. Stable technology supports Internet-based and 
stand-alone apps for smartphones, tablet computers, and 
other portable devices. Users can search online keys for spe- 
cific taxa or characteristics. They can also follow breadcrumb 
trails — a navigation tool metaphorically similar to their 
namesake — to retrace their steps if they go astray. Online 
keys often use a variety of media, including text, drawings, 
photographs, audio, and video, set in visually appealing user 
interfaces that facilitate taxon identification with a mini- 
mum of steps (Leggett and Kirchoff 2011). Pop-up windows 
or hyperlinked glossaries define technical terms. The clever 
use of multimedia and machine-learning algorithms makes 
stand-alone apps and online guides simultaneously acces- 
sible to beginners and useful for specialists. Together, these 
features increase the accuracy of identifications, offer rewards 
for the user, and encourage learning. 

Creating digital field guides 

Several software packages and online resources allow people 
to create digital keys for organisms using a standard taxon ×
character-state data matrix (table 1). Some programs, including
Intkey, IdentifyIt, Linnaeus II, Lucid, MEKA, NaviKey,
PollyClave, XID, and Xper2, are stand-alone freeware or proprietary
software (Dallwitz 2011). Others, including ActKey
(Brach and Song 2005), eFloras (Brach and Song 2006), and 
Stinger’s Lightweight Interactive Key Software (SLIKS) run 
as Web-based apps. Technical reviews of many of these pack- 
ages have been published elsewhere (Edwards and Morse 
1995, Dallwitz 2011). 

Table 1. Software for creating customized field guides.

Software                                                Web address
The Electronic Field Guide Project                      http://efg.cs.umb.edu/efg
Linnaeus II                                             www.eti.uva.nl/products/linnaeus.php
Horticopia                                              www.horticopia.com
Lucid Key Server                                        www.lucidcentral.com/en-us/software/lucidkeyserver.aspx
SLIKS (Stinger's Lightweight Interactive Key Software)  www.stingersplace.com/SLiKS
The XID Authoring System                                http://xidservices.com
Xper2                                                   http://lis-upmc.snv.jussieu.fr/lis/?q=en/resources/software/xper2
Intkey                                                  http://delta-intkey.com/www/programs.htm

More recently, the Electronic Field Guide (EFG;
http://efg.cs.umb.edu/efg; table 1) Project of the University of
Massachusetts Boston has developed software that allows
scientists and lay people to make their own Web-based,
digital field guides without the restrictions imposed either
by paper formats or by commercial considerations (Morris 
et al. 2007). Like most electronic field-guide generators, the 
initial efforts of the EFG Project were focused on develop- 
ing pictorial or text-based keys that allow users to choose 
different views and ways of accessing them. The subsequent 
efforts were more flexible. In the Microsoft Windows- and 
Linux-compatible EFG2 software, a simple text file consisting
of a taxon × character-state matrix is used to organize
and store text-based information. This master text file also 
contains pointers to a folder that includes files in a variety 
of media, including illustrations, digital photographs, maps, 
video clips, and sounds. While creating or updating guides, 
the user can drag and drop files into the folder to import new 
content. The final product includes Web displays of taxon 
pages, configurable lists of taxa, and browse-and-search 
modes. End users have found this model (in which a master 
file points to a folder) easy to learn and sufficiently flexible 
to enable quick construction of custom field guides. Over 
30 customized guides, covering a range of taxa and regions, 
have been produced to date using EFG2. 
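
The master-file-plus-folder model can be illustrated with a short Python sketch. The file layout below (a comma-delimited text file with media columns named photo and sound) is a hypothetical stand-in chosen for the example; it is not the actual EFG2 file format, and the taxa, character values, and file names are illustrative only.

```python
import csv
import io
from pathlib import PurePosixPath
from typing import Dict

# Hypothetical master text file: one row per taxon, one column per character,
# plus columns that hold file names pointing into a media folder.
MASTER_TEXT = """taxon,breast_color,bill_shape,photo,sound
Turdus migratorius,orange,slender,turdus_migratorius.jpg,turdus_migratorius.mp3
Cyanocitta cristata,white,stout,cyanocitta_cristata.jpg,
"""

MEDIA_COLUMNS = {"photo", "sound"}  # assumed names for the pointer columns


def load_guide(text: str, media_folder: str) -> Dict[str, dict]:
    """Split each row into character states and resolved media paths."""
    guide = {}
    for row in csv.DictReader(io.StringIO(text)):
        taxon = row.pop("taxon")
        characters = {k: v for k, v in row.items()
                      if k not in MEDIA_COLUMNS and v}
        media = {k: str(PurePosixPath(media_folder) / v)
                 for k, v in row.items() if k in MEDIA_COLUMNS and v}
        guide[taxon] = {"characters": characters, "media": media}
    return guide


guide = load_guide(MASTER_TEXT, "media")
```

Under this layout, adding a new photograph means placing the file in the media folder and entering its name in the master file; empty cells are simply skipped.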

Customizing field guides for particular situations 

EFG is a generic platform for making field guides. With 
an existing field guide in hand, users can easily adjust the 
species set to fit their locality or interest. The central concept
of a customizable field guide is the local list: a subset
of information extracted from a much larger database but 
restricted to, for example, a particular location, habitat, 
time of year, time of day, or observation method. The key to 
a workable local list is a large database with a flexible struc- 
ture. For example, the New England Wild Flower Society’s 
Go Botany database structure is generic and extensible 
(http://gobotany.newenglandwild.org; also see supplemental
table S1, available online at http://dx.doi.org/10.1525/bio.2013.63.11.8,
for links to all of the guides discussed
here). The user interface and species list can potentially 
be customized for any flora, and the New England Wild 
Flower Society is working with five institutional partners 
to create floras for their regions. 
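
A local list is, in essence, a filtered view of a larger database. The following Python sketch shows one way such a filter might look; the Record fields, the sample records, and the local_list function are assumptions made for illustration and do not describe the Go Botany or EFG data structures.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Record:
    """One taxon in a hypothetical master database (illustrative fields)."""
    taxon: str
    regions: FrozenSet[str]
    habitats: FrozenSet[str]
    months: FrozenSet[int]  # months in which the taxon is readily observed


# Illustrative records only; not authoritative distribution data.
RECORDS = [
    Record("Trillium erectum", frozenset({"Massachusetts", "Vermont"}),
           frozenset({"forest"}), frozenset({4, 5, 6})),
    Record("Sarracenia purpurea", frozenset({"Massachusetts"}),
           frozenset({"bog"}), frozenset({6, 7})),
    Record("Solidago sempervirens", frozenset({"Massachusetts"}),
           frozenset({"coastal"}), frozenset({8, 9, 10})),
]


def local_list(records: Iterable[Record], region: Optional[str] = None,
               habitat: Optional[str] = None,
               month: Optional[int] = None) -> List[str]:
    """Return the taxa that satisfy every restriction that was supplied."""
    selected = []
    for record in records:
        if region is not None and region not in record.regions:
            continue
        if habitat is not None and habitat not in record.habitats:
            continue
        if month is not None and month not in record.months:
            continue
        selected.append(record.taxon)
    return sorted(selected)
```

Because each restriction is optional, the same function yields a regional list, a habitat list, or a list for a particular site visit (region plus month) without any change to the underlying database.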

Likewise, the Web-based, open-source Atrium biodiversity 
information system (www.atrium-biodiversity.org) includes a
digital herbarium, a geographic information
system data repository, a bibliographic
reference-management system,
a meteorological data module, and a 
module for managing and analyzing 
quantitative vegetation survey data. 
Users can browse or filter collection 
records by taxonomy, collector, project, 
or geographic region. Detailed collec- 
tion data, high-resolution zoomable 
images of fresh plant material and 
preserved specimens, maps of collec- 
tion localities, and multilevel taxon 
descriptions are all viewable. All data 
and images are downloadable and can be reused in custom- 
ized guides; the Field Guide Generator utility is one of the 
most popular modules in Atrium. 

Behind the scenes: The importance of database 
structures and semantics 

The utility of any digital field guide is dependent not only 
on the availability and accuracy of species-level data but 
also on a common (and formal) language (ontology) used
to specify a set of core concepts and ideas (e.g., Walls et al. 
2012). Such formal structures are especially important for 
field guides covering many taxa with multiple-access keys 
(modern-day polyclaves). As taxon × character-state matrices
get very large (more than 500 taxa [rows] or more than
50 character states [columns]), searching slows down, and 
errors are increasingly likely to occur in data input. Most 
developers shift to relational databases (rDBs) to handle 
large numbers of taxa, to increase the database query speed 
and to reduce the number of repetitive entries. For example, 
in an rDB, a genus name need only be entered once; species 
entries then point to the genus table to capture higher-level 
characters. 
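
The genus example can be made concrete with Python's built-in sqlite3 module. The schema below is a minimal sketch under assumed table and column names (genus, species, leaf_arrangement); it is not the schema of any guide discussed here.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create an in-memory database with a genus table and a species table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE genus (
            id INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL,
            leaf_arrangement TEXT          -- a higher-level character
        );
        CREATE TABLE species (
            id INTEGER PRIMARY KEY,
            genus_id INTEGER NOT NULL REFERENCES genus(id),
            epithet TEXT NOT NULL
        );
        CREATE INDEX species_genus ON species(genus_id);
    """)
    # The genus name and its shared character are entered exactly once.
    conn.execute(
        "INSERT INTO genus (id, name, leaf_arrangement) VALUES (1, 'Acer', 'opposite')"
    )
    # Species rows only point to the genus row.
    conn.executemany(
        "INSERT INTO species (genus_id, epithet) VALUES (?, ?)",
        [(1, "rubrum"), (1, "saccharum")],
    )
    return conn


def full_names(conn: sqlite3.Connection):
    """Join species to genus to recover binomials and inherited characters."""
    query = """
        SELECT g.name || ' ' || s.epithet, g.leaf_arrangement
        FROM species AS s JOIN genus AS g ON g.id = s.genus_id
        ORDER BY s.epithet
    """
    return conn.execute(query).fetchall()
```

Each species inherits the genus-level character through the join, so a correction to the genus row propagates to every species that points to it.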

Relational databases (and the digital field guides based 
on them) use a semantic data model, in which instances 
(e.g., red oak) are defined for types (e.g., tree). Several such 
standardized semantic frameworks exist for taxonomy: the 
Description Language for Taxonomy (DELTA), Lucid, and 
Nexus (Dallwitz 2010). These frameworks work with many 
types of data: ranked or unranked multistate categorical 
variables, continuous or discrete numbers, or even free text. 
These data not only serve to record machine-readable char- 
acter states for various taxa so that species can be distin- 
guished but can also be parsed to provide natural-language 
descriptions of taxa. Finally, common semantics-based 
systems allow data to be exchanged among different apps 
(interoperability), which can allow end users to create new
types of field guides that could, for example, combine the 
data for multiple higher-level taxa in a single habitat. 
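
As a rough illustration of how machine-readable character states can be parsed into natural-language descriptions, consider the Python sketch below. The encoding of characters (strings, lists of alternative states, and numeric ranges as tuples) and the describe function are assumptions made for the example; they do not reproduce the DELTA, Lucid, or Nexus formats, and the sample values are illustrative.

```python
from typing import Dict, List, Tuple, Union

# Assumed encodings: a single state, a list of alternative states,
# or a numeric range given as (minimum, maximum, unit).
State = Union[str, List[str], Tuple[float, float, str]]


def describe(taxon: str, characters: Dict[str, State]) -> str:
    """Render coded character states as a one-sentence description."""
    parts = []
    for name, value in characters.items():
        label = name.replace("_", " ")
        if isinstance(value, tuple):
            low, high, unit = value
            parts.append(f"{label} {low}-{high} {unit}")
        elif isinstance(value, list):
            parts.append(f"{label} " + " or ".join(sorted(value)))
        else:
            parts.append(f"{label} {value}")
    return f"{taxon}: " + "; ".join(parts) + "."


example = describe(
    "Acer rubrum",
    {
        "leaf_arrangement": "opposite",
        "leaf_length": (5, 10, "cm"),
        "flower_color": ["red", "yellow"],
    },
)
```

The same coded record therefore serves two purposes: it can be matched against a user's observations in a key, and it can be rendered as readable text on a taxon page.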

Relational databases also provide field-guide authors 
with easy-to-use interfaces for updating or correcting data 
as new information becomes available. For example, as 
taxonomic names are changed or new distribution records 
are discovered, a single new entry in an rDB can ensure
that the field guide and keys remain definitive sources for 
accurate names and biogeography. Now that any taxon can 
be described or revised in online-only publications (Labeda 
and Oren 2008, Knapp et al. 2011, Zhang 2012), online iden- 
tification keys, updated regularly and even in near real time, 
will supplant printed keys. 

An important challenge for any digital field guide is its 
scalability: the capacity for a database both to accommodate 
and to rapidly deliver information — such as character states, 
images, and other linked data — for tens of thousands or even 
millions of diverse taxa. Physical space (memory) is rarely 
an issue for online (Web-based) guides, but it is for stand- 
alone apps loaded on small mobile devices, whose limited 
processor speeds also increase the time required to search 
through vast databases. Formal ontologies and semantics 
can help speed the searching and parsing of large databases. 
Graph databases, in which items in the database are linked 
together — like computers on the World Wide Web or neural 
nets (Edwards and Morse 1995) — using nodes and the paths 
between them instead of the look-up index of an rDB, have 
the as-yet-unexplored potential to further speed the search 
of digital field guides with multiple-access keys and very large 
databases (Angles and Gutierrez 2008). 

Features of interactive Web-based identification keys 

We compiled a list of 50 species-identification tools that are 
available online (table S1); all but 10 were Web based (i.e.,
not stand-alone apps). Although this list cannot be exhaus- 
tive (identification apps are burgeoning), the entries illus- 
trate the range of features currently available in online field 
guides. After summarizing these features (see table S1 for a
concise summary and Web addresses for all sites and apps 
discussed), we illustrate specific attributes with reference to 
the resources that we have developed. 

All but four of the digital identification keys (96%) pro- 
vide detailed data on specific taxa, including range maps, 
information on life histories, and distinguishing character- 
istics. Thirty-nine (76%) enable users to search or browse 
for a particular species of interest. Almost half (43%) offer a 
glossary of technical terms or a dedicated help page with tips 
for usage. Twenty (39%) allow users either to upload data 
to a central repository or to share data with selected other 
users. Of the 29 apps that feature identification keys, 18 have 
multiple-access keys, and 14 offer dichotomous keys.

Go Botany is one example of a resource that has all of 
these features (figure 1). Go Botany is a free suite of Internet- 
based interactive identification keys and learning tools that 
runs on desktop computers, laptops, tablets, and mobile 
devices. Dynamic multiple-access keys and more-technical, 
clickable dichotomous keys appeal to novices and experts, 
respectively. Both types of keys allow users to track and change 
their path to identification using breadcrumb trails. Several 
aspects of the display, including autoprompting search tools 
and virtual display cases showing thumbnail images of plants, 
are adapted from familiar formats originally developed for
e-commerce (Matt Beige, Vision and Logic LLC, personal
communication, 23 June 2013); therefore, even novice users 
learn how to use the app very quickly. Alternatively, users can 
select a candidate species set by directly choosing the plant 
family or genus; family and genus information pages help the 
users learn higher levels of taxonomic organization, which is 
useful in formal botany courses. 

Go Botany includes more than 3500 plant taxa native and 
naturalized to New England (Haines 2011) and is built from 
an extensive database covering plant morphology, habitat 
affinities, synonymies, look-alike taxa, and species distribu- 
tions. Over 37,000 photographs, technical drawings, and 
range maps illustrate the keys and species information pages. 
This wealth of detail presents a key challenge for Go Botany 
and other online multiple-access keys for many taxa: The 
application requires a connection to the Internet, because a 
stand-alone app would use most of the storage capacity cur- 
rently available on handheld devices. 

Stand-alone mobile apps 

Smartphones and tablets are nearly ubiquitous. These por- 
table devices have increasingly large amounts of storage; 
built-in, GPS-enabled cameras; and instant connectivity to 
social networks and networks of experts. Many field guides 
can be stored on a single smartphone and used as identifi- 
cation manuals, study guides, and data-collection devices. 
Hundreds of commercial apps are available for purchase 
or free download, covering a wide array of species; a recent 
(7 August 2013) search yielded 25 apps for plant identifica- 
tion alone on iTunes, 9 of which are free. Many apps exhibit 
a variety of features, including identification tools that use 
simple icons representing character states, information 
on each taxon, and the ability to network with others and 
instantly upload sightings (table S1).

The Guide to Texas Range and Pasture Plants, from the
Botanical Research Institute of Texas (BRIT; www.brit.org/rangeplants),
is an example of a simple, image-based
plant-identification system aimed at the general public. The 
BRIT guide is an inexpensive app with which rural students, 
farmers, ranchers, and naturalists can view and study images 
of herbarium specimens to identify 129 species of range 
plants. The guide provides images; nomenclature; pronun- 
ciation guides; and a brief description of each plant that 
includes its growth season, its value for wildlife and grazing 
animals, and data on whether it is native. Users can review 
species with a flashcard feature or can test themselves with 
identification quizzes. 

Figure 1. The species-identification page of Go Botany asks the user to answer simple questions about the plant that is to
be identified. The user clicks on a question in the left frame, which opens a dialogue box with a question and a helpful
hint for observing the specimen. All botanical terms (in this example, node) are provided with an illustrated definition
on rollover, and the choices of character states are simply illustrated with diagrammatic drawings. In the background are
photos that show images of the species in the results set. The user can also click “Get more questions” to be provided with
more questions about features the user can actually see on the plant of interest. Source: New England Wild Flower Society,
http://gobotany.newenglandwild.org.

Computer-based visual recognition is used to identify species
in another way; a person sees an organism, photographs
it, and queries a database for the identity of the resulting
image (MacLeod 2008). Rapid advances in using visual-recognition
software are yielding automated systems for
identifying plants, insects, vertebrates, and benthic invertebrates
(e.g., Gobi 2010, Lytle et al. 2010). Leafsnap (figure 2;
http://leafsnap.com) is a widely used visual-recognition app
for identifying trees in the northeastern United States; it was
developed by the Smithsonian Institution, in partnership
with Columbia University and the University of Maryland. 
Leafsnap emphasizes interactivity: The user takes a photo- 
graph of a single leaf, using the built-in camera of their 
iPhone or iPad, and Leafsnap then compares the photograph 
to a central library of more than 9000 images stored in a 
remote database. Leafsnap automatically determines the 
contours of the leaf and uses visual-recognition software 
to find a match for it in the database (Agarwal et al. 2006, 
Belhumeur et al. 2008); results are returned to the user in 
5-20 seconds, depending on the speed of the network con- 
nection. Next, Leafsnap brings up high-resolution images of 
the leaf, along with images of the species’ flower, fruit, seeds, 
and bark. The app also supplies background information 
on the species and its geographic distribution. When the 
identification is not straightforward, Leafsnap users can dig into
other related images in its database, such as fruit shape or
leaf venation patterns. In the end, it is up to the user to make
the correct determination of the species, which reinforces 
botanical learning. Once a user successfully identifies a tree, 
his or her photograph and accompanying GPS location 
data are automatically uploaded into Leafsnap’s database, 
contributing to the work of a community of scientists who 
are using the stream of data to map and monitor how the 
abundances and geographic ranges of different tree species 
are changing through time and as a function of climatic 
change. 

Building online communities to enhance public 
engagement with science 

Widespread popular interest in natural history and the 
availability of next-generation field guides are facilitating the
growing engagement of citizen scientists with the profes- 
sional scientific community (Newman et al. 2012). Apps such 
as Leafsnap, nationwide efforts such as the USA National
Phenology Network (Schwartz et al. 2012), and international 
projects such as the Tropical Ecology Assessment and 
Monitoring Network (Andelman 2011) are generating high- 
quality, georeferenced data on species’ diversity, shifting 
distribution patterns, and life histories. Go Botany has a 
citizen-science portal, called PlantShare, where plant enthu- 
siasts can share sightings, get crowdsourced or expert advice 
on identifying plants, and create checklists. The Cornell Lab 
of Ornithology’s eBird Web site ( http://ebird.org/content/ 
ebird ) encourages users to report bird sightings for use in 
a citizen-science project documenting changes in species 
ranges. 

Similarly, BugGuide (http://bugguide.net), an online, taxonomically
organized, and image-rich guide to the insects and
arthropods of North America, is a resource for people who 
enjoy learning about and sharing observations of insects and 
other arthropods. From BugGuide’s launch in 2003 to the 
end of 2012, it reached more than 4 million unique visitors; 
image submissions have increased even faster. BugGuide 
currently includes more than 26,000 species on more than 
42,000 pages and nearly 580,000 images that have been con- 
tributed by nearly 22,000 individuals. A distinctive aspect of 
BugGuide is the ability of a user to request an identification 
of an unknown specimen; it is 1 of only 2 of the 16 applications
that we reviewed that do so (table S1). Volunteer
editors and taxonomic experts monitor the request queue 
and identify the specimens (i.e., images); most are identified 
within 1 day of submission. After identification, the images 
are moved out of the request queue and into BugGuide 
itself, ending up on individual-taxon information pages cre- 
ated by a BugGuide editor. Each taxon is arranged within a 
taxonomic hierarchy, and each level of that hierarchy has 
its own information page. These pages contain contrib- 
uted images and information on a range of topics, includ- 
ing species diversity, key characteristics, distribution, and 
ecological characteristics, which can be used to generate 
distribution maps and summaries of phenological informa- 
tion. BugGuide has become a popular online resource for 
enthusiasts of the study of insects, not only because of its 
content and ease of use but also because of its welcoming 
atmosphere for both scientists and citizen scientists. 



Discover Life (www.discoverlife.org) also links next- 
generation field guides with citizen science to yield high- 
quality scientific data. This integrated science and education 
platform, currently used by more than 350,000 people 
every month, enables users to collect and analyze data on 
the identity, distribution, and abundance of organisms; to 
conduct original research; and to learn science. Discover 
Life’s integrated tools include more than 600 multiple-access 
identification guides to, and checklists for, groups including 
plants, vertebrates, fungi, and many arthropods; a global 
mapper that displays the distribution of more than 480,000 
species; and quantitative tools to assess changes in pheno- 
logy. User-created albums enable contributors to manage 
the data associated with photographs, to map where they 
photographed a species, and to maintain a digital list of 
their contributions. Since March 2010, for example, through 
Discover Life’s moth project, phenological data on more 
than 1200 species from 130,000 photographs from North 
America and Costa Rica have been identified and analyzed. 

With any data-collection effort, ensuring verifiable con- 
tributions is paramount. Discover Life research protocols 
require participants to include photographs of the time and 
date on their cell phone, of a GPS display, and of landmarks 
to confirm that the time and location are correct. Novices, 
experts, and computer algorithms work in concert to name 
specimens and correct errors associated with observations. 
Discover Life, eBird, and BugGuide, among many others, 
employ both professional and citizen scientists as modera- 
tors or gatekeepers for new data. As the number of users and 
the volume of their contributed data increase, more moderation
will be needed. It is especially crucial to ensure that
sensitive data on rare species (such as the locations of taxa 
vulnerable to poaching) and information that personally 
identifies specific users are protected. 

Improving online tools with machine learning and 
crowdsourcing 

Figure 2. The mobile application Leafsnap consists of a number of interactive screens that provide the user with
information about tree species and with the ability to automatically identify a tree by taking a photograph of an isolated
leaf. (a) Leafsnap Home screen. (b) Leafsnap provides two types of games to hone the skills of the user in identifying species
from leaves, flowers, or fruits. (c) Green Sweep challenges the user to place a moving leaf into the correct species box.
(d) The Browse mode allows the user to scroll through thumbnails of leaves, flowers, or fruits of the species included in the
application; species can be sorted by common or scientific name. (e) For each species selected in the Browse mode, high-resolution
photographs of all parts of the plant are illustrated to help in identifications. (f) The Browse mode also provides
a short text description of the plant. (g) The Snap It! mode allows users to take a photo on a white background of a leaf of
the unidentified species. (h) The shape of the leaf is automatically separated from the background and sent to the home
server; within a short time, a list of prioritized identifications is sent back to the user. (i) Once the proper identification is
selected by the user, the name of the species and the location coordinates are recorded in the user’s own collection page in
Leafsnap and on the central Leafsnap server. Source: W. John Kress, Leafsnap.

Rather than simply consulting a printed field guide for
help with identification, many people now attempt online
searches. For example, more than 8 million people annually
visit the Cornell Lab of Ornithology’s All About Birds Web
site (http://allaboutbirds.org); many use the search box in
All About Birds as an identification tool, typing descriptors
such as “small bird with black stripe.” But search engines are 
not built as identification tools (the semantics differ) and 
often return incomplete or misleading results. Therefore, 
the Cornell Lab of Ornithology is developing the Merlin 
bird-identification tool. Merlin asks people questions about 
their sightings and uses citizen-science data, crowdsourced 
descriptors, and artificial intelligence algorithms to enhance 
the semantic definitions within All About Birds, thereby 
improving the search results. Merlin first asks users when 
and where they saw the bird, then taps into the eBird citizen-science
database, which contains more than 100 million
observations from birders. Merlin thus narrows the number 
of possible species seen by 75% or more to the set of taxa 
most likely to be encountered at any given location and 
time of year. Merlin then helps users further refine their 
identifications on the basis of attributes they saw, such as 
color, size, and behavior. However, people notice, recall, and
describe the same details in many different ways, including
ways that do not match the descriptors in a database provided
by experts (e.g., Kempen and Tredoux 2012); the end
result can be a misidentification. To improve the returns of
searches based on variable responses, Merlin uses artificial
intelligence algorithms that draw on a user’s prior responses
to inform the next question it asks, in much the same way
that the Go Botany algorithm does. It uses probabilities to
tap into a database populated with expert descriptors and
crowdsourced descriptors gathered through online activities
such as Mark My Bird, in which people describe the traits
of birds on the basis of photos. As with Leafsnap,
machine-learning algorithms enable Merlin to get
“smarter” and return more-accurate identifications as more
people use it.
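
The two-stage narrowing described above (first by place and time, then by reported attributes) can be sketched as a simple Bayesian ranking. This is a minimal illustration and not Merlin's actual implementation: the frequencies and descriptor probabilities are invented toy values, and the names `FREQUENCY`, `DESCRIPTOR_PROB`, `DEFAULT_PROB`, and `rank_species` are hypothetical.

```python
# Toy prior: share of checklists reporting each species at one location,
# keyed by month (invented values standing in for eBird-style frequencies).
FREQUENCY = {
    "Black-capped Chickadee": {1: 0.60, 7: 0.45},
    "Blackpoll Warbler": {1: 0.00, 7: 0.02},
    "Downy Woodpecker": {1: 0.35, 7: 0.25},
}

# Toy likelihoods: probability that an observer who saw the species would
# report each descriptor (invented; in practice these could combine expert
# and crowdsourced descriptions).
DESCRIPTOR_PROB = {
    "Black-capped Chickadee": {"small": 0.9, "black cap": 0.8},
    "Blackpoll Warbler": {"small": 0.9, "black cap": 0.7},
    "Downy Woodpecker": {"small": 0.7, "black and white": 0.8, "climbs trunks": 0.8},
}

# Assumed probability of reporting a descriptor not listed for a species.
DEFAULT_PROB = 0.05


def rank_species(month, descriptors, min_frequency=0.01):
    """Rank candidate species for one sighting.

    Stage 1 drops species that are rarely reported in the given month.
    Stage 2 multiplies each remaining prior by the likelihood of every
    descriptor the user reported, then normalizes the scores.
    """
    candidates = {
        species: by_month.get(month, 0.0)
        for species, by_month in FREQUENCY.items()
    }
    candidates = {s: p for s, p in candidates.items() if p >= min_frequency}

    scores = {}
    for species, prior in candidates.items():
        score = prior
        for descriptor in descriptors:
            score *= DESCRIPTOR_PROB[species].get(descriptor, DEFAULT_PROB)
        scores[species] = score

    total = sum(scores.values())
    if total == 0:
        return []
    ranked = [(species, score / total) for species, score in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_species(1, ["small", "black cap"])` drops the warbler, which is not reported in January in the toy data, and ranks the chickadee first. The sketch treats descriptors as independent of one another, a simplifying assumption that a production system would likely need to relax.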

Merlin uses visual-recognition software developed by
the Visipedia project (www.vision.caltech.edu/visipedia), which
analyzes crowdsourced data to help computers recognize
objects in images. Massive amounts of data can be gathered in
a short time by engaging the online communities of citizen-science
projects such as eBird, social media, and other Web
sites. Photographers have contributed more than 80,000
annotated photographs of birds, which were used to develop 
the visual recognition system. Mark My Bird and other online 
activities have gathered more than 250,000 rounds of data in 
6 months from volunteers who are “teaching” Merlin about 
the color, size, and shape of birds as the public perceives and 
describes them. If Merlin proves successful, the techniques 
used to develop it will be adapted for other taxa, providing a 
new generation of online identification tools. 

Bringing your field guide to the public 

Field guide developers may think that they have produced 
the perfect app, but people will use it in unpredictable 
ways (such as using search entry fields to look for “black 
ants with large gasters”). Ultimately, the process of creating 
a next-generation field guide that communicates reliable 
information and that people will want to use depends on
four fundamental steps: (1) clearly identifying the target
audience; (2) conducting iterative user testing with that 
audience during the design and development of the app; 
(3) ensuring that the data are protected from inappropriate 
reuse; and (4) building long-term resources to maintain the 
app, update the data, and respond to changing technology. 

Although user testing is a sine qua non of software and
Web-site development, opaque and difficult-to-navigate
Web sites continue to crowd the Internet. In clearly articulating
the characteristics of the archetypal user of a next-generation
field guide, it is useful to develop a persona
(sensu Cooper et al. 2007), based on interviews of prospective
users or on empirical evidence, that describes the user’s
motivation, computer expertise, and level of experience
with the taxonomic group and terminology. This persona
should be a portrait sufficiently rich that the design team
can determine whether a user matching the persona would
use a certain feature. Iterative, objective user testing must be
conducted to help refine and improve preliminary designs
(wireframes) for a field guide. In such tests, the wireframes
are shown to a set of users, not affiliated with the project,
who broadly represent the personas identified by the design
team. User tests are reality checks in the design process:
They help the design and programming team overcome
internal biases and ensure that the application will be user
friendly and widely adopted.

Copyright protection for intellectual property, including 
photographs and illustrations, should be in place, and 
image contributors must have mechanisms for permitting 
the reuse of their work. Creative Commons licenses are frequently
used as a means to control the reuse of proprietary
data, and sets of best practices for noncommercial uses are 
available (Hagedorn et al. 2011). Before building any new 
Web app, the developers should research existing patents 
to make certain that unintentional infringement does not 
occur, especially if the product is intended for sale. 

The Web can be a place where good ideas go viral and 
flourish or a burying ground for obsolete information or 
Web sites. Developing versatile apps that are adaptable to 
changing platforms, browsers, operating systems, and other 
software is an expensive, long-term enterprise. Long-term 
support is crucial to ensure the longevity of a next-generation 
field guide; it is also crucial early in the life of a project to 
articulate a vision for generating income in order to guarantee
a long life for the product. Start-up funding, such as
grants provided by the National Science Foundation, can 
jump-start a new initiative, but it is typically temporary. 
Users can be engaged, Wikipedia style, to contribute time 
and money to sustaining Web sites, but donations can decline 
as interest wanes. Citizen scientists and other users are most 
likely to stay engaged when the scientists who developed 
the application communicate clearly how user-generated 
data improve science. Developed with these considerations in mind,
electronic field guides are essential, evolving tools that will
transform how professional and amateur biologists collaborate
to identify new species and move science forward.